Abstract

A generalization of numeration systems in which $\mathbb{N}$ is recognizable
by finite automata can be obtained by describing a lexicographically
ordered infinite regular language. Here we show that if $P \in \mathbb{Q}[x]$ is a
polynomial such that $P(\mathbb{N}) \subseteq \mathbb{N}$, then we can construct a numeration
system in which the set of representations of $P(\mathbb{N})$ is regular. The
main issue in this construction is to set up a regular language with a
density function equal to $P(n+1) - P(n)$ for $n$ large enough.

1 Introduction 

Recently, P. Lecomte and I introduced in [5] the concept of a numeration
system on a regular language. A numeration system is a triple $S = (L, \Sigma, <)$
where $L$ is an infinite regular language over a totally ordered finite alphabet
$(\Sigma, <)$. The lexicographic ordering of $L$ gives a one-to-one correspondence
$r_S$ between the set $\mathbb{N}$ of natural numbers and the language $L$.

For each $n \in \mathbb{N}$, $r_S(n)$ denotes the $(n+1)$th word of $L$ with respect to
the lexicographic ordering and is called the $S$-representation of $n$.

For $w \in L$, we set $\mathrm{val}_S(w) = r_S^{-1}(w)$ and we call it the numerical value
of $w$.
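As a minimal sketch of these definitions, the following Python snippet enumerates a language by increasing length and, within each length, lexicographically. This reading of the ordering is an assumption made here because it yields a bijection with $\mathbb{N}$. The example language $a^*b^*$, its encoding as a regular expression, and the names `words`, `rep` and `val` are illustrative and do not come from the text.

```python
import re
from itertools import count, islice, product

def words(pattern, alphabet):
    """Yield the words of the language by increasing length, then lexicographically.

    `alphabet` lists the letters in increasing order; `pattern` is a regular
    expression describing the language (an illustrative encoding).
    """
    regex = re.compile(pattern)
    for n in count():
        # product() is lexicographic when the alphabet is given in increasing order
        for letters in product(alphabet, repeat=n):
            w = "".join(letters)
            if regex.fullmatch(w):
                yield w

def rep(n, pattern, alphabet):
    """r_S(n): the (n+1)-th word of the language."""
    return next(islice(words(pattern, alphabet), n, None))

def val(w, pattern, alphabet):
    """val_S(w): position of w in the enumeration (w must belong to the language)."""
    for i, u in enumerate(words(pattern, alphabet)):
        if u == w:
            return i

print([rep(n, "a*b*", "ab") for n in range(6)])  # ['', 'a', 'b', 'aa', 'ab', 'bb']
print(val("ab", "a*b*", "ab"))                   # 4
```

The brute-force enumeration is exponential in the word length and is only meant to make the correspondence between integers and words concrete.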

When one has a simple method to represent integers, some natural questions
about "recognizability" arise. By recognizability, one means the following.
Let $S$ be a numeration system and $X$ be a subset of $\mathbb{N}$. Then $X$
is said to be $S$-recognizable if $r_S(X)$ is recognizable by a finite automaton.
Therefore we can consider two kinds of questions.

• For a given numeration system $S$, is it possible to determine which
subsets of $\mathbb{N}$ are $S$-recognizable?

• For a given subset $X$ of $\mathbb{N}$, is it possible to find a numeration system
$S$ in which $X$ is $S$-recognizable?

To give a partial but very important answer to the first question, it
is shown in [?] that arithmetic progressions are always recognizable in any
numeration system. It is also shown that if $X$ is recognizable for some system
$S$ then $X + k$ is also $S$-recognizable. (These two results will be useful in
some proofs of this paper.)

In [?], we were interested in the second question when $X$ is the set $\mathcal{P}$
of primes. It is shown that $r_S(\mathcal{P})$ is never recognizable for any numeration
system $S$. In this paper, we will be mainly concerned with the second question
when $X$ is a polynomial image of $\mathbb{N}$.

For classical numeration systems with integer base, it is well-known that
the set of the perfect squares is not $k$-recognizable for any $k \in \mathbb{N} \setminus \{0, 1\}$
(see [?] for a survey about classical numeration systems). However, in [5] we
show quite easily that the numeration system

$S = (a^*b^* \cup a^*c^*, \{a, b, c\}, a < b < c)$

is such that the set $r_S(\{n^2 : n \in \mathbb{N}\})$ is regular. The choice of the language
$a^*b^* \cup a^*c^*$ was guided by density considerations: this language has
exactly $2n + 1$ words of length $n$. In view of this result, J.-P. Allouche asked
the following question. Is it possible to generalize the result about the set
of the perfect squares to the set $\{n^k : n \in \mathbb{N}\}$, $k > 2$? Moreover, if $P$ is a
polynomial belonging to $\mathbb{N}[x]$ (resp. $\mathbb{Z}[x]$ or $\mathbb{Q}[x]$) such that $P(\mathbb{N}) \subseteq \mathbb{N}$ then
can one find a numeration system such that $P(\mathbb{N})$ is recognizable?
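The density argument for the perfect squares can be checked by brute force. The sketch below assumes that words are ordered by increasing length and then lexicographically with $a < b < c$; the names `LANG`, `ALPHABET` and `words_of_length` are illustrative. Since there are $2n + 1$ words of each length $n$, exactly $n^2$ words are shorter than $n$, so the representation of $n^2$ is the first word of length $n$.

```python
import re
from itertools import product

LANG = re.compile(r"a*b*|a*c*")   # the language a*b* ∪ a*c*
ALPHABET = "abc"                  # a < b < c

def words_of_length(n):
    """Words of length n in the language, in lexicographic order."""
    candidates = ("".join(t) for t in product(ALPHABET, repeat=n))
    return [w for w in candidates if LANG.fullmatch(w)]

# Enumerate by increasing length, then lexicographically.
enumeration = [w for n in range(7) for w in words_of_length(n)]

densities = [len(words_of_length(n)) for n in range(7)]
squares = [enumeration[n * n] for n in range(7)]
print(densities)  # [1, 3, 5, 7, 9, 11, 13]
print(squares)    # ['', 'a', 'aa', 'aaa', 'aaaa', 'aaaaa', 'aaaaaa']
```

On this range the representations of the squares are exactly the words of $a^*$, which is consistent with the regularity claim.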

In all these cases, we answer affirmatively. For a given polynomial $P$,
we give an explicit method to construct a numeration system such that
$r_S(P(\mathbb{N}))$ is regular. For this purpose, we show how to obtain a regular
language which contains exactly $P(n+1) - P(n)$ words of length $n$ for $n$
large enough. The construction of regular languages with specified density
is a problem that reaches beyond numeration systems.

The fact that the set of primes is never recognizable while the polynomial
images of $\mathbb{N}$ are recognizable gives another interpretation of a well-known
result (see [?, Theorem 21]): no non-constant polynomial $f(n)$ with integral
coefficients can be prime for all $n$, or for all sufficiently large $n$.
2 Recognizability of polynomials



Our aim will be to construct a numeration system in which $P(\mathbb{N})$ is
recognizable when $P \in \mathbb{Q}[x]$ and $P(\mathbb{N}) \subseteq \mathbb{N}$.

We will proceed in four steps. First of all, we give an explicit iterative
method to obtain regular languages such that the number of words of
length $n$ is exactly $n^k$ (in [?] it is said that such languages can be easily
obtained). The languages which are given here can be interpreted as the
basic building blocks of our method.

In the three other steps, we gradually increase the difficulty. First we
consider the case $P \in \mathbb{N}[x]$, which is quite simple since we only deal with
the operation of addition. Next we consider $P \in \mathbb{Z}[x]$; here the problem
of subtraction must be resolved. Finally, we have the most general case,
$P \in \mathbb{Q}[x]$, and the problem of division. In each of these last three steps, we
give an instructive short example of construction.

i) Languages with density $n^k$

First we recall some basic definitions and operations on languages.

Definition 1 The density function of a language $L \subseteq \Sigma^*$ is

$\rho_L : \mathbb{N} \to \mathbb{N} : n \mapsto \#(\Sigma^n \cap L)$

where $\#A$ denotes the cardinality of the set $A$.

Definition 2 If $x$ and $y$ are two words of $\Sigma^*$ then the shuffle of $x$ and $y$ is
the language $x \sqcup\!\sqcup y$ defined by

$\{x_1 y_1 \cdots x_n y_n : x = x_1 \cdots x_n,\ y = y_1 \cdots y_n,\ x_i, y_i \in \Sigma^*,\ 1 \le i \le n,\ n \ge 1\}.$

If $L_1, L_2 \subseteq \Sigma^*$ then the shuffle of the two languages is the language

$L_1 \sqcup\!\sqcup L_2 = \{w \in \Sigma^* : w \in x \sqcup\!\sqcup y \text{ for some } x \in L_1,\ y \in L_2\}.$

Recall that if $L_1, L_2$ are regular then $L_1 \sqcup\!\sqcup L_2$ is also regular (see for
instance [?, Proposition 3.5]).

Definition 3 Let $L \subseteq \Sigma^*$. Then $\Sigma$ is the minimal alphabet of $L$ if $\forall \sigma \in \Sigma$,
$\exists w \in L : w = u \sigma v$, $u, v \in \Sigma^*$.
We want to construct regular languages $L_k$ such that $\rho_{L_k}(n) = n^k$. The
first two languages are, for example, $L_0 = a^*$ and $L_1 = a^+ b^*$.

To construct a language $L_2$, we first need a language $M_2$ such that
$\rho_{M_2}(n) = n + 1$. We can take $M_2 = a^* b^*$. Hence $L_2 = M_2 \sqcup\!\sqcup \{c\}$. Indeed,
the words of length $n$ belonging to $L_2$ are obtained
from the $n$ distinct words of length $n - 1$ belonging to $M_2$, and in each of these
words $c$ can be inserted in $n$ different places. Thus one has exactly $n^2$
words of length $n$ in $L_2$. As an example, we have below the construction of
the nine words of length 3,
$a^* b^*$        $a^* b^* \sqcup\!\sqcup \{c\}$

aa   →   aac, aca, caa

ab   →   abc, acb, cab

bb   →   bbc, bcb, cbb.

Observe that the letter $c$ does not belong to the minimal alphabet of $M_2$.
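The construction of $L_2$ can be sketched in a few lines of Python. The function names `m2` and `l2` are illustrative; the code inserts the new letter c at every position of each word of $a^*b^*$ of length $n - 1$ and counts the distinct results.

```python
def m2(n):
    """Words of length n in M_2 = a*b* (there are n + 1 of them)."""
    return ["a" * i + "b" * (n - i) for i in range(n, -1, -1)]

def l2(n):
    """Words of length n in L_2: insert c at each position of a word of M_2 of length n - 1."""
    if n == 0:
        return []
    return [w[:j] + "c" + w[j:] for w in m2(n - 1) for j in range(n - 1, -1, -1)]

print(l2(3))
# ['aac', 'aca', 'caa', 'abc', 'acb', 'cab', 'bbc', 'bcb', 'cbb']
print([len(set(l2(n))) for n in range(8)])
# [0, 1, 4, 9, 16, 25, 36, 49]
```

All inserted words are distinct because c does not occur in the words of $M_2$, which is why the count is exactly $n \cdot n$.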

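To construct $L_3$, we simply need a language $M_3$ such that $\rho_{M_3}(n) =
(n+1)^2$. This can be done using the previously defined languages $L_0, L_1, L_2$,
each of them written on a different alphabet,

$M_3 = \underbrace{(a^* b^* \sqcup\!\sqcup \{c\})}_{\rho(n) = n^2} \cup \underbrace{d^+ e^* \cup f^+ g^*}_{\rho(n) = 2n} \cup \underbrace{h^*}_{\rho(n) = 1}.$

Then we have $L_3 = M_3 \sqcup\!\sqcup \{i\}$.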

This procedure can be repeated and thus, for any $k \ge 2$, $L_k$ can be
obtained as a union of previously constructed languages followed by one
shuffle with a new letter.

In the following, the notations $M_k$ and $L_k$ will refer to the previously
constructed languages such that $\rho_{M_k}(n) = (n+1)^{k-1}$ and $\rho_{L_k}(n) = n^k$.
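The iterative procedure can be sketched recursively. The code below is a sketch under stated assumptions: letters are represented by integers drawn from a counter `fresh`, words by tuples, and `make_L` is a hypothetical helper name. For uniformity it builds $M_2$ as a union of one copy of $L_0$ and one copy of $L_1$ instead of the more economical $a^*b^*$ used in the text; the density is the same.

```python
from itertools import count
from math import comb

def make_L(k, fresh):
    """Return f with f(n) = set of words of length n in a copy of L_k on fresh letters."""
    if k == 0:
        a = next(fresh)
        return lambda n: {(a,) * n}                      # a*
    if k == 1:
        a, b = next(fresh), next(fresh)
        return lambda n: {(a,) * i + (b,) * (n - i)      # a+ b*
                          for i in range(1, n + 1)}
    # M_k: comb(k-1, i) copies of each L_i, so that its density is (n+1)^(k-1)
    parts = [make_L(i, fresh) for i in range(k) for _ in range(comb(k - 1, i))]
    c = next(fresh)                                      # the new letter to shuffle in

    def words(n):
        if n == 0:
            return set()
        shorter = set().union(*(p(n - 1) for p in parts))
        return {w[:j] + (c,) + w[j:] for w in shorter for j in range(n)}

    return words

L3 = make_L(3, count())
print([len(L3(n)) for n in range(6)])  # [0, 1, 8, 27, 64, 125]
```

Only one copy of $L_0$ appears in each $M_k$, so the empty word is counted once, as required by $(0 + 1)^{k-1} = 1$.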

Remark 1 Let $u_k$ be the size of the minimal alphabet of $L_k$. The construction
of $L_k$ gives

$u_0 = 1,\ u_1 = 2,\ u_2 = 3,$

$u_m = \sum_{i=0}^{m-1} \binom{m-1}{i} u_i + 1, \quad \forall m \ge 3.$

By direct inspection, one can check that $u_3 = 9$, $u_4 = 26$, $u_5 = 90 < 5!$ and
for $n = 6, \dots, 10$, $u_n < n!$. Let $m \ge 11$. Since $\binom{m-1}{i} < \binom{m-1}{5}$ for $i \le 4$, one
easily obtains, by induction on $m$, the following upper bound

$u_m < \sum_{i=0}^{m-1} i! \binom{m-1}{i} = e\,\Gamma(m, 1) < e\,(m-1)!$

where $\Gamma(m, 1)$ is the incomplete gamma function defined by

$\Gamma(a, x) = \int_x^{+\infty} t^{a-1} e^{-t}\, dt.$
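The recurrence of Remark 1 is easy to evaluate numerically. The following sketch (the function name `u` mirrors the notation of the remark; the range tested is an arbitrary choice) computes the first alphabet sizes and compares them with $m!$.

```python
from functools import lru_cache
from math import comb, factorial

@lru_cache(maxsize=None)
def u(m):
    """Size of the minimal alphabet of L_m, following the recurrence of Remark 1."""
    if m <= 2:
        return m + 1          # u_0 = 1, u_1 = 2, u_2 = 3
    return sum(comb(m - 1, i) * u(i) for i in range(m)) + 1

print([u(m) for m in range(8)])  # [1, 2, 3, 9, 26, 90, 352, 1521]
print(all(u(m) < factorial(m) for m in range(5, 13)))
```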

Remark 2 In view of an earlier version of this paper, J. Shallit suggested
another construction of a language $K$ such that $\rho_K(n) = n^k$. It uses the
following result (see [?, Section 6.5])

$n^k = \sum_{t=0}^{k} \binom{n}{t}\, t!\, S(k, t)$

where $S(k, t)$ are the Stirling numbers of the second kind. The language over
$\{a, b\}$ of all strings containing exactly $t$ letters $b$ is regular and
has density $\rho(n) = \binom{n}{t}$. Therefore a union of such languages on distinct
alphabets gives the language $K$.

This construction is perhaps simpler than the construction of $L_k$ but uses
a larger alphabet. The size of the minimal alphabet is $\max_{t=0,\dots,k} t!\, S(k, t)$
and a lower bound is given by $k!$. We won't use it in the following.
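The identity behind this alternative construction can be verified numerically. In the sketch below, `stirling2` implements the standard recurrence for Stirling numbers of the second kind and `density_K` (a hypothetical name) sums the densities $\binom{n}{t}$ with multiplicity $t!\,S(k,t)$.

```python
from functools import lru_cache
from math import comb, factorial

@lru_cache(maxsize=None)
def stirling2(k, t):
    """Stirling number of the second kind S(k, t)."""
    if k == t:
        return 1
    if t == 0 or t > k:
        return 0
    return t * stirling2(k - 1, t) + stirling2(k - 1, t - 1)

def density_K(k, n):
    """Total number of words of length n over t! S(k,t) languages of density C(n,t)."""
    return sum(factorial(t) * stirling2(k, t) * comb(n, t) for t in range(k + 1))

print([density_K(3, n) for n in range(6)])  # [0, 1, 8, 27, 64, 125]
```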

ii) Recognizability of polynomials belonging to $\mathbb{N}[x]$

The main idea is that we have to find a regular language such that the
positions of the first words of each length are the values taken by the polynomial.

Proposition 4 Let $P \in \mathbb{N}[x]$. If $P(\mathbb{N}) \subseteq \mathbb{N}$ then there exists a numeration
system $S = (L, \Sigma, <)$ such that $P(\mathbb{N})$ is $S$-recognizable.

Proof. Since translation by a constant does not alter the recognizability of
a set, as recalled in the introduction (see [?] for details), we can assume that
$P(0) = 0$. We have to construct a regular language $L$ such that the number
of words of length $n$ is exactly $P(n+1) - P(n)$. Since $P(n+1) - P(n)$ only
contains powers of $n$ with non-negative integral coefficients, the construction
of $L$ can be easily achieved by a union of languages $L_k$ on distinct alphabets
(there is a small restriction for the language $L_0$; we explain it in the following
example to keep this proof simple). To conclude the proof, recall that if a
language $L$ is regular then the language $I(L)$ formed of the
smallest words of each length for the lexicographic ordering is still regular
[?]. One can check that $r_S(P(\mathbb{N})) = I(L)$. □
Example 1 Let $P(x) = 2x^2 + 3x$. Then

$P(x+1) - P(x) = 4x + 5.$

We consider the language $L$ which is formed by four copies of $L_1$ and five
copies of $L_0$.

A very important remark is that with five copies of $L_0$, we obtain five
words of any positive length but only one empty word $\varepsilon$. To get rid of
this problem we add to our language four new words of length 1 (we thus
add four letters to the alphabet). This remark applies to all the following
constructions: if one uses $m$ copies of $L_0$ then one adds $m - 1$ words of length 1
and treats the length $n = 1$ separately.

One can check that for $n \neq 1$, the first word of length $n$ is the $[P(n) + 1]$th
word of $L$ and

$r_S(P(\mathbb{N} \setminus \{1\})) = I(L \setminus \Sigma).$

Therefore $r_S(P(\mathbb{N}))$ is regular since we only add one word, $r_S(P(1))$, to a
regular language.
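The counting behind Example 1 can be reproduced directly. The sketch below is illustrative: letters are encoded as integers (0 to 7 for the four copies of $L_1$, 8 to 12 for the five copies of $L_0$, 13 to 16 for the four extra one-letter words), and `first_word_position` is a hypothetical helper returning the number of words shorter than a given length.

```python
def P(x):
    return 2 * x * x + 3 * x

def words_of_length(n):
    """Words of length n in the language of Example 1, as tuples of integer letters."""
    ws = set()
    for c in range(4):                 # four copies of L_1 = a+ b* on letters (2c, 2c+1)
        a, b = 2 * c, 2 * c + 1
        ws |= {(a,) * i + (b,) * (n - i) for i in range(1, n + 1)}
    for c in range(5):                 # five copies of L_0 = a* on letters 8..12
        ws.add((8 + c,) * n)
    if n == 1:                         # four extra one-letter words on letters 13..16
        ws |= {(13 + c,) for c in range(4)}
    return sorted(ws)

def first_word_position(n):
    """Number of words shorter than n, i.e. the value of the first word of length n."""
    return sum(len(words_of_length(m)) for m in range(n))

print([first_word_position(n) for n in range(6)])  # [0, 1, 14, 27, 44, 65]
print([P(n) for n in range(6)])                    # [0, 5, 14, 27, 44, 65]
```

The two lists agree everywhere except at $n = 1$, which is the single value handled separately in the example.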

Corollary 5 Let $k \in \mathbb{N} \setminus \{0, 1\}$. There exists a numeration system $S$ such
that the set $\{x^k : x \in \mathbb{N}\}$ is $S$-recognizable. □

iii) Recognizability of polynomials belonging to $\mathbb{Z}[x]$

The following lemma gets rid of the problem of coefficients belonging to $\mathbb{Z}$
instead of $\mathbb{N}$.

Lemma 6 Let $k$ and $a$ be two positive integers. There exists a regular
language $\mathcal{L}$ such that $\rho_{\mathcal{L}}(n) = n^k - a\, n^{k-1}$ for all $n \ge a$.

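Proof. Assume that $k \ge 2$. Let $\Sigma_k$ be the minimal alphabet of $M_k$. Then
$L_k = M_k \sqcup\!\sqcup \{\sigma\}$ where $\sigma \notin \Sigma_k$. For $i = 1, \dots, n$, $L_k$ has exactly $n^{k-1}$ words
of length $n$ with $\sigma$ in position $i$. From this observation, one can check that

$\mathcal{L} = L_k \setminus \bigcup_{i=0}^{a-1} \Sigma_k^i\, \sigma\, \Sigma_k^*$

has exactly $n^k - a\, n^{k-1}$ words of length $n$ for $n \ge a$. Notice that $\rho_{\mathcal{L}}(n) = 0$
if $n < a$.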
If $k = 1$ then we have to remove the first $a$ words of each length from
$L_1$: the first words of each length, then the second words of each length,
and so on.

Notice one more time that $\rho_{\mathcal{L}}(n) = 0$ if $n < a$. □
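The counting argument of the lemma for $k \ge 2$ can be checked on $L_2$. The sketch below is illustrative: the new letter is written c, `l2` rebuilds the words of $L_2$ of a given length, and `lemma_words` (a hypothetical name) discards the words whose letter c falls among the first $a$ positions.

```python
def l2(n):
    """Words of length n in L_2 = a*b* shuffled with the new letter c."""
    shorter = ["a" * i + "b" * (n - 1 - i) for i in range(n)] if n > 0 else []
    return {w[:j] + "c" + w[j:] for w in shorter for j in range(n)}

def lemma_words(n, a):
    """Words of L_2 of length n whose letter c is not among the first a positions."""
    return {w for w in l2(n) if w.index("c") >= a}

a = 3
print([len(lemma_words(n, a)) for n in range(8)])  # [0, 0, 0, 0, 4, 10, 18, 28]
```

For $n \ge a$ the counts equal $n^2 - a\,n$, and they vanish for $n < a$, as stated in the proof.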

Proposition 7 Let $P \in \mathbb{Z}[x]$. If $P(\mathbb{N}) \subseteq \mathbb{N}$ then there exists a numeration
system $S = (L, \Sigma, <)$ such that $P(\mathbb{N})$ is $S$-recognizable.

Proof. We proceed as in Proposition 4 and consider the polynomial $Q(n) =
P(n+1) - P(n)$. Observe that since $P(\mathbb{N}) \subseteq \mathbb{N}$, the coefficient of the
dominant power in $P$ is positive and thus the same holds for $Q$. By
adding extra terms of the form $x^i - x^i$, if $\deg(Q) = k$ we can assume that

$Q(x) = x^{i_1+1} - a_{i_1} x^{i_1} + \cdots + x^{i_r+1} - a_{i_r} x^{i_r} + \sum_{l=0}^{k} b_l x^l$

where $i_1, \dots, i_r \in \{0, \dots, k-1\}$, $a_{i_1}, \dots, a_{i_r} \in \mathbb{N} \setminus \{0\}$ and $b_0, \dots, b_k \in \mathbb{N}$.
Let $a = \sup_{j=1,\dots,r} a_{i_j}$. Using Lemma 6, for $j = 1, \dots, r$ we construct
languages $\mathcal{L}_j$ such that for all $n \ge a$, $\rho_{\mathcal{L}_j}(n) = n^{i_j+1} - a_{i_j} n^{i_j}$. The reader
can easily construct a language $L$ such that $\forall n \ge a$, $\rho_L(n) = Q(n)$ by a union of
languages $\mathcal{L}_j$ and $L_l$.

If we want to consider the smallest word of each length, as in Proposition 4,
then the language $L$ must contain exactly $P(a)$ words of length at most
$a - 1$ (in this case, the first word of length $a$ is the $[P(a) + 1]$th word of
$L$ and its numerical value is thus $P(a)$). This can be achieved by adding
or removing a finite number of words from the regular language $L$ (this
operation doesn't alter the regularity of $L$). Thus

$r_S(\{P(n) : n \ge a\}) = I(L) \cap \Sigma^{\ge a}.$

To conclude we have to add a finite number of words for the representation
of $P(0), \dots, P(a-1)$ and

$r_S(P(\mathbb{N})) = (I(L) \cap \Sigma^{\ge a}) \cup \{r_S(P(0)), \dots, r_S(P(a-1))\}.$

□
Example 2 Let $P(x) = x^4 - 3x^2 - 2x + 5$. Then

$Q(x) = P(x+1) - P(x) = 4x^3 + 6x^2 - 2x - 4$

$\phantom{Q(x)} = 4x^3 + 5x^2 + x^2 - 3x + x - 4.$

With four copies of $L_3$, five copies of $L_2$ and using Lemma 6, one can
construct a regular language $L$ such that¹

$\rho_L(n) = \begin{cases} 4n^3 + 6n^2 - 2n - 4 & \text{if } n \ge 4 \\ 4n^3 + 5n^2 & \text{otherwise.} \end{cases}$

We have $P(4) = 205$ and the number of words of length at most 3
belonging to $L$ is 214; thus we remove 9 words of length at most 3 from $L$.
Therefore, the first word of length 4 in $L$ is the representation of $P(4)$ and

$r_S(\{P(n) : n \ge 4\}) = I(L) \cap \Sigma^{\ge 4}$     (1)

is a regular subset of $L$. Since $\{P(0), \dots, P(3)\}$ is equal to $\{1, 5, 53\}$, we
add the second, the 6th and the 54th word of $L$ to (1) to obtain $r_S(P(\mathbb{N}))$.
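The numbers quoted in Example 2 can be recomputed with a short script. The helper `lemma6` (a hypothetical name) returns the density supplied by Lemma 6, namely $n^k - a\,n^{k-1}$ for $n \ge a$ and 0 otherwise, and `rho_L` adds up the pieces of the decomposition of $Q$.

```python
def P(x):
    return x**4 - 3 * x**2 - 2 * x + 5

def lemma6(k, a, n):
    """Density n^k - a n^(k-1) for n >= a and 0 otherwise, as provided by Lemma 6."""
    return n**k - a * n**(k - 1) if n >= a else 0

def rho_L(n):
    """Four copies of L_3, five copies of L_2, and two languages from Lemma 6."""
    return 4 * n**3 + 5 * n**2 + lemma6(2, 3, n) + lemma6(1, 4, n)

short_words = sum(rho_L(n) for n in range(4))     # words of length at most 3
print(P(4), short_words, short_words - P(4))      # 205 214 9
print(sorted({P(n) for n in range(4)}))           # [1, 5, 53]
```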



Example 3 We give another example which shows how to obtain a correct
expression for $\rho_L(n)$ in a trickier situation. Let $P(x) = x^5 - 4x^3 - 2x^2 + 8$,
then

$Q(x) = 5x^4 + 9x^3 + x^3 - 3x^2 + x^2 - 12x + x - 5.$

To construct a language $L$, we use five copies of $L_4$, nine copies of $L_3$ and
apply Lemma 6 three times. Thus

$\rho_L(n) = \begin{cases} Q(n) & \text{if } n \ge 12 \\ 5n^4 + 10n^3 - 3n^2 + n - 5 & \text{if } 12 > n \ge 5 \\ 5n^4 + 10n^3 - 3n^2 & \text{if } 5 > n \ge 3 \\ 5n^4 + 9n^3 & \text{otherwise.} \end{cases}$
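The piecewise expression of Example 3 follows mechanically from the thresholds of the three applications of Lemma 6. As a sketch (with `lemma6`, `rho_L` and `piecewise` as illustrative names), one can compare the sum of the individual densities with the case distinction:

```python
def P(x):
    return x**5 - 4 * x**3 - 2 * x**2 + 8

def lemma6(k, a, n):
    """Density n^k - a n^(k-1) for n >= a and 0 otherwise (Lemma 6)."""
    return n**k - a * n**(k - 1) if n >= a else 0

def rho_L(n):
    """Five copies of L_4, nine copies of L_3, and three languages from Lemma 6."""
    return (5 * n**4 + 9 * n**3
            + lemma6(3, 3, n) + lemma6(2, 12, n) + lemma6(1, 5, n))

def piecewise(n):
    """The case distinction stated in Example 3."""
    if n >= 12:
        return P(n + 1) - P(n)
    if n >= 5:
        return 5 * n**4 + 10 * n**3 - 3 * n**2 + n - 5
    if n >= 3:
        return 5 * n**4 + 10 * n**3 - 3 * n**2
    return 5 * n**4 + 9 * n**3

print(all(rho_L(n) == piecewise(n) for n in range(40)))  # True
```

The thresholds 3, 5 and 12 are the constants $a$ of the three subtracted terms, which explains why four regimes appear.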



iv) Recognizability of polynomials belonging to Q[x] 

Finally, we obtain the theorem of recognizability in the general case. 

¹ Here the expression of $\rho_L(n)$ is very simple since 3 and 4 only differ by one unit
(remark that $4n^3 + 6n^2 - 2n - 4 = 4n^3 + 6n^2 - 3n \Leftrightarrow n = 4$ and $4n^3 + 6n^2 - 3n =
4n^3 + 5n^2 \Leftrightarrow n = 3$ or $0$).
Theorem 8 Let $P \in \mathbb{Q}[x]$. If $P(\mathbb{N}) \subseteq \mathbb{N}$ then there exists a numeration
system $S = (L, \Sigma, <)$ such that $P(\mathbb{N})$ is $S$-recognizable.

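Proof. Let

$P(x) = \frac{a_k}{b_k} x^k + \frac{a_{k-1}}{b_{k-1}} x^{k-1} + \cdots + \frac{a_0}{b_0}$

with $b_0, \dots, b_k, a_k \in \mathbb{N} \setminus \{0\}$ and $a_0, \dots, a_{k-1} \in \mathbb{Z}$. Let $s$ be the least
common multiple of $b_0, \dots, b_k$. One has

$P = \frac{P'}{s}$

with $P' \in \mathbb{Z}[x]$. By hypothesis $P(\mathbb{N}) \subseteq \mathbb{N}$; thus $P'(\mathbb{N}) \subseteq s\mathbb{N}$. As in
Proposition 7, there exist a constant $a$ and a language $L'$ such that $\forall n \ge a$,

$\rho_{L'}(n) = P'(n+1) - P'(n) = s\,[P(n+1) - P(n)].$

We modify $L'$ (by adding or removing a finite number of words) to have

$\sum_{i=0}^{a-1} \rho_{L'}(i) = P'(a).$

It was proved in [?] that the arithmetic progression $s\mathbb{N}$ is recognizable for
any numeration system. Let $S' = (L', \Sigma, <)$; then $L = r_{S'}(s\mathbb{N})$ is a regular
language such that

$\sum_{i=0}^{a-1} \rho_L(i) = P(a)$ and $\forall n \ge a$, $\rho_L(n) = P(n+1) - P(n).$

We conclude as in Proposition 7. □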
Example 4 Let

$P(x) = \frac{x^4}{3} - 2x^3 + \frac{37}{6} x^2 - \frac{17}{2} x + 4$

$\phantom{P(x)} = \frac{1}{3} (x - 7)\, x^2 (x + 1) + \frac{17}{2}\, x (x - 1) + 4.$

The reader can easily check that $P(\mathbb{N}) \subseteq \mathbb{N}$. We have $s = 6$ and

$P'(n+1) - P'(n) = 8n^3 - 24n^2 + 46n - 24$

$\phantom{P'(n+1) - P'(n)} = 7n^3 + 45n + n^3 - 24n^2 + n - 24.$

Using seven copies of $L_3$, 45 copies of $L_1$ and applying Lemma 6 twice, we
construct a language $L'$ such that

$\rho_{L'}(n) = \begin{cases} 6\,(P(n+1) - P(n)) & \text{if } n \ge 24 \\ 7n^3 + 45n & \text{otherwise.} \end{cases}$

The number of words of length at most 23 in $L'$ is 545652 and $6P(24) =
517776$. Thus we remove 27876 words from $L' \cap \Sigma^{\le 23}$. In this
new language, lexicographically ordered, we only take the words at position
$6i + 1$, $i \in \mathbb{N}$, to obtain the regular language $L$. Thus the $[P(24) + 1]$th
word of $L$ is the first word of length 24 belonging to $L$ and

$r_S(\{P(n) : n \ge 24\}) = I(L) \cap \Sigma^{\ge 24}.$

To conclude, we have as usual to add a finite number of words for the
representation of $P(0), \dots, P(23)$.
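The arithmetic of Example 4 can be verified with exact rational numbers. The sketch below assumes the polynomial $P(x) = \frac{x^4}{3} - 2x^3 + \frac{37}{6}x^2 - \frac{17}{2}x + 4$, which is the one consistent with $s = 6$ and the stated difference $8n^3 - 24n^2 + 46n - 24$; the variable names are illustrative.

```python
from fractions import Fraction

def P(x):
    return (Fraction(x**4, 3) - 2 * x**3 + Fraction(37 * x**2, 6)
            - Fraction(17 * x, 2) + 4)

s = 6
# Words of length at most 23 in L': the density is 7n^3 + 45n below the threshold 24.
short_words = sum(7 * n**3 + 45 * n for n in range(24))
target = s * P(24)

# P takes non-negative integer values on the tested range.
print(all(P(n).denominator == 1 and P(n) >= 0 for n in range(200)))
print(short_words, target, short_words - target)   # 545652 517776 27876
```

Keeping every sixth word of the adjusted language then divides all these counts by $s = 6$, which is how the positions $P(n)$ are recovered.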

Remark 3 In [?], we studied the problem of changing the ordering of
the alphabet and we exhibited a subset $X$ of $\mathbb{N}$ and numeration
systems $S$ and $S'$ which only differ by the ordering of the alphabet such that
$r_S(X)$ is regular and $r_{S'}(X)$ is not.

This kind of singularity doesn't appear here. For a given polynomial
$P$, we have shown how to construct a particular numeration system $S =
(L, \Sigma, <)$ such that $P(\mathbb{N})$ is $S$-recognizable. By construction, one can easily
check that $P(\mathbb{N})$ is also $T$-recognizable for any system $T = (L, \Sigma, \prec)$ where
$\prec$ is a reordering of $\Sigma$.